Reducing Overheads of Local Communications in Fine-grain Parallel Computation
Authors
Abstract
For fine-grain computation to be effective, the cost of communications between the large number of subtasks should be minimized. In this paper, we present an optimization technique which reduces overheads of communications between local subtasks by bypassing the network interface and transferring data directly from memory or registers to memory. On average, the optimization results in a 35.6% improvement in total execution time on instruction-level simulations with six benchmark programs from 1 to 32 nodes.
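The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical `Node` class whose `nic_queue` stands in for the network interface and whose `memory` dictionary stands in for node memory. A message between subtasks on the same node is stored directly into the destination memory, while a message to another node still goes through the interface queue.

```python
from collections import deque


class Node:
    """Hypothetical processing node (illustration only)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.memory = {}          # address -> value, stands in for node memory
        self.nic_queue = deque()  # stands in for the network interface


def send(src, dst, addr, value, stats):
    """Deliver `value` to `addr` on `dst`, bypassing the interface if local."""
    if src is dst:
        # Local communication: write straight into memory, no interface hop.
        dst.memory[addr] = value
        stats["bypassed"] += 1
    else:
        # Remote communication: enqueue on the destination's interface.
        dst.nic_queue.append((addr, value))
        stats["via_network"] += 1


def drain(node):
    """Move messages that arrived through the interface into memory."""
    while node.nic_queue:
        addr, value = node.nic_queue.popleft()
        node.memory[addr] = value
```

In this sketch a local send completes in a single store, whereas a remote send needs a later `drain` step before the value is visible. That extra step is the kind of overhead the described optimization avoids for local subtasks.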
Similar resources
Distributed Filaments: Efficient Fine-grain Parallelism on a Cluster of Workstations
A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as iterative grid computations, recursive fork/join programs, the bodies of parallel FOR loops, and the implicit parallelism in functional or dataflow languages. It is useful both to describe massively parall...
Distributed Filaments: Efficient Fine-Grain Parallelism on a Cluster of Workstations
A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as iterative grid computations, recursive fork/join programs, the bodies of parallel FOR loops, and the implicit parallelism in functional or dataflow languages. It is useful both to describe massively paral...
Capsules: expressing composable computations in a parallel programming model
A well-known problem in designing high-level parallel programming models and languages is the “granularity problem”, where the execution of parallel task instances that are too fine-grain incurs large overheads in the parallel runtime and decreases the speed-up achieved by parallel execution. On the other hand, tasks that are too coarse-grain create load-imbalance and do not adequately utilize th...
Fine grain parallelism on a MIMD machine using FPGAs
Current MIMD machines are used for coarse-grain parallelism and also offer message-passing mechanisms to deal with inter-processor communications. But these mechanisms lack efficiency in fine-grain parallel applications such as systolic computation. This article presents the use of an FPGA chip to set up a fast systolic communication agent on a linear asynchronous network of Transputer processors; ...
A Multithreaded Parallel Implementation of a Dynamic Programming Algorithm for Sequence Comparison
This paper discusses the issues involved in implementing a dynamic programming algorithm for biological sequence comparison on a general-purpose parallel computing platform based on a fine-grain event-driven multithreaded program execution model. Fine-grain multithreading permits efficient parallelism exploitation in this application both by taking advantage of asynchronous point-to-point synch...